A Portable Programming Interface for Performance Evaluation on Modern Processors
Abstract
The purpose of the PAPI project is to specify a standard application programming interface (API) for accessing hardware performance counters available on most modern microprocessors. These counters exist as a small set of registers that count events, which are occurrences of specific signals and states related to the processor’s function. Monitoring these events facilitates correlation between the structure of source/object code and the efficiency of the mapping of that code to the underlying architecture. This correlation has a variety of uses in performance analysis including hand tuning, compiler optimization, debugging, benchmarking, monitoring and performance modeling. In addition, it is hoped that this information will prove useful in the development of new compilation technology as well as in steering architectural development towards alleviating commonly occurring bottlenecks in high performance computing.
Similar resources
The perfmon2 interface specification
Keywords: Performance Monitoring Unit, PMU, performance tools, hardware counters, IPF, IA64 Linux, perfmon kernel interface. Monitoring program execution is becoming key to achieving world-class performance. All modern processors implement a sophisticated set of hardware performance counters that collect many micro-architectural events, which are important clues for software optimization. Yet there is n...
Building a Diverse and Innovative Workforce
PAPI is a widely used portable library for accessing hardware counters on modern microprocessors. PAPI offers both counting and sampling interfaces, but the sampling interface is extremely limited, consisting of a simple interrupt-driven interface that can periodically report processor state. In the past few years, the hardware and operating systems of modern processors have added support for n...
Modeling and Performance Evaluation of Multi-Processors Organization with Shared Memories
This paper is primarily concerned with the theoretical evaluation of the performance of multiprocessor systems. A Markovian waiting-line model has been developed for various multiprocessor configurations with shared memory. The system is analysed at the request level rather than the job level.
Performance-Portable Many-Core Plasma Simulations: Porting PIConGPU to OpenPower and Beyond
With the appearance of the heterogeneous platform OpenPower, many-core accelerator devices have been coupled with Power host processors for the first time. To utilize their full potential, it is worth investigating performance-portable algorithms that allow choosing the best-fitting hardware for each domain-specific compute task. Suiting even the high level of parallelism on modern GPGP...
Quantitative performance analysis of the SPEC OMPM2001 benchmarks
The state of modern computer systems has evolved to allow easy access to multiprocessor systems by supporting multiple processors on a single physical package. As the multiprocessor hardware evolves, new ways of programming it are also developed. Some inventions may merely be adopting and standardizing the older paradigms. One such evolving standard for programming shared-memory parallel comput...
Journal title:
- IJHPCA
Volume 14
Publication date: 2000